We thank the reviewers for their constructive feedback and meaningful insights. We additionally tested our method on Humanoid-v2 and confirmed that it works there as well. The result was reversed in Hopper, where the RL actor contributed 200.86 while the EA actors contributed 363.53. Therefore, all performance scores are measured at a fixed number of interaction steps.

R2: Ablation study is missing. We presented the effect of the variance update rule in Appendix C.3 by comparing the results with and without it. We then provided all combinations of our proposed mean and variance update rules in Table 2. We will add a section so that these results can be seen at a glance.
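For concreteness, a mean/variance update over an evolutionary population is typically a CEM-style Gaussian refit. The sketch below is only a generic illustration under that assumption; it is not the paper's actual rule, which the rebuttal locates in Appendix C.3 and Table 2.

```python
import numpy as np

def cem_update(population, fitness, elite_frac=0.5, noise_floor=1e-2):
    """One CEM-style Gaussian update: refit the sampling mean and
    variance to the elite fraction of the population.

    A generic sketch only; the paper's actual mean/variance update
    rules are those ablated in its Appendix C.3 and Table 2.
    """
    n_elite = max(1, int(len(population) * elite_frac))
    elite_idx = np.argsort(fitness)[-n_elite:]      # indices of top performers
    elites = population[elite_idx]
    mean = elites.mean(axis=0)                      # mean update
    var = elites.var(axis=0) + noise_floor          # variance update + exploration floor
    return mean, var

# Example: a population of 10 candidate parameter vectors of dimension 4.
rng = np.random.default_rng(0)
pop = rng.normal(size=(10, 4))
fit = rng.normal(size=10)
mu, var = cem_update(pop, fit)
next_pop = rng.normal(mu, np.sqrt(var), size=(10, 4))  # sample next generation
```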
SUPPLEMENTARY MATERIAL: Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games
Figure 1 shows an example of the raw interface of the game "ztuu", where raw textual observations are displayed. In this section, we show the first 15 interaction steps of two games: "zork1" and "ztuu".

===== Step 1 =====
1. Chosen action and reward
Action: west | Reward: 0 | Score: 0
===== Step 2 =====
1. Chosen action and reward
Action: south | Reward: 0 | Score: 0
===== Step 3 =====
1. Chosen action and reward
Action: south | Reward: 0 | Score: 0
===== Step 4 =====
1. Chosen action and reward
Action: west | Reward: 0 | Score: 0
===== Step 5 =====
1. [transcript truncated here]
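Transcripts of this form can be generated with the Jericho interface to Z-machine games such as "zork1" and "ztuu". The sketch below assumes Jericho and a locally available story file, neither of which the supplementary material names explicitly.

```python
from jericho import FrotzEnv  # pip install jericho

# "zork1.z5" is a placeholder path; story files are distributed separately.
env = FrotzEnv("zork1.z5")
obs, info = env.reset()

# Replay the first few actions from the transcript above.
for step, action in enumerate(["west", "south", "south", "west"], start=1):
    obs, reward, done, info = env.step(action)
    print(f"===== Step {step} =====")
    print(f"Action: {action} | Reward: {reward} | Score: {info['score']}")
```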
Inference-Time Personalized Alignment with a Few User Preference Queries
Pădurean, Victor-Alexandru, Kamalaruban, Parameswaran, Kotalwar, Nachiket, Gotovos, Alkis, Singla, Adish
We study the problem of aligning a generative model's response with a user's preferences. Recent works have proposed several different formulations for personalized alignment; however, they either require a large number of user preference queries or require that the preference be explicitly specified as a text input. In this paper, we propose a novel inference-time personalized alignment method, UserAlign, that elicits the user's preferences with a few queries posed as pairwise response comparisons. In particular, UserAlign builds on the theoretical framework of best-arm identification in logistic bandits and selects a personalized response from a fixed pool of the model's generated responses. The key idea is to treat the user's feedback as consistent and noise-free, and to incorporate it into the theoretical framework to identify the best response quickly. Experimental results across several tasks, involving personalized text and image generation, showcase the effectiveness of UserAlign in achieving personalized alignment.
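The abstract's key idea has a simple consequence worth spelling out: if pairwise feedback is consistent and noise-free, the best of n pooled responses can be identified with n - 1 comparisons by a running-winner pass. The sketch below illustrates only that special case, with `ask_user` as a hypothetical stand-in for the preference query; it is not the paper's logistic-bandit acquisition rule.

```python
from typing import Callable, List

def best_response(pool: List[str],
                  ask_user: Callable[[str, str], int]) -> str:
    """Return the user's most-preferred response from a fixed pool.

    Assumes feedback is consistent and noise-free (the abstract's key
    idea), so a single running-winner pass with len(pool) - 1 pairwise
    queries suffices. A sketch, not the paper's acquisition rule.
    """
    winner = pool[0]
    for challenger in pool[1:]:
        # ask_user returns 0 if its first argument is preferred, 1 otherwise.
        if ask_user(winner, challenger) == 1:
            winner = challenger
    return winner
```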
Information Seeking for Robust Decision Making under Partial Observability
Fang, Djengo Cyun-Jyun, Ke, Tsung-Wei
Explicit information seeking is essential to human problem-solving in practical environments characterized by incomplete information and noisy dynamics. When the true environmental state is not directly observable, humans seek information to update their internal dynamics and inform future decision-making. Although existing Large Language Model (LLM) planning agents have addressed observational uncertainty, they often overlook discrepancies between their internal dynamics and the actual environment. We introduce Information Seeking Decision Planner (InfoSeeker), an LLM decision-making framework that integrates task-oriented planning with information seeking to align internal dynamics and make optimal decisions under uncertainty in both agent observations and environmental dynamics. InfoSeeker prompts an LLM to actively gather information by planning actions to validate its understanding, detect environmental changes, or test hypotheses before generating or revising task-oriented plans. To evaluate InfoSeeker, we introduce a novel benchmark suite featuring partially observable environments with incomplete observations and uncertain dynamics. Experiments demonstrate that InfoSeeker achieves a 74% absolute performance gain over prior methods without sacrificing sample efficiency. Moreover, InfoSeeker generalizes across LLMs and outperforms baselines on established benchmarks such as robotic manipulation and web navigation. These findings underscore the importance of tightly integrating planning and information seeking for robust behavior in partially observable environments. The project page is available at https://infoseekerllm.github.io
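A minimal sketch of the plan/seek loop the abstract describes might look as follows; `llm` and `env` are hypothetical stand-ins rather than the paper's API, and the prompt wording is illustrative only.

```python
from typing import Callable

def infoseeker_loop(llm: Callable[[str], str], env, max_steps: int = 20):
    """Sketch of an InfoSeeker-style loop: before committing to a task
    action, ask the LLM whether an information-seeking action is needed
    to validate its understanding of the partially observed environment.
    `llm` and `env` are hypothetical stand-ins, not the paper's API."""
    obs = env.reset()
    history = [f"observation: {obs}"]
    for _ in range(max_steps):
        context = "\n".join(history)
        # Decide whether to gather information or act on the task.
        decision = llm(
            "Given the interaction history below, reply with either\n"
            "SEEK: <probing action> to test a hypothesis about the\n"
            "environment, or ACT: <task action> to pursue the goal.\n\n"
            + context
        )
        action = decision.split(":", 1)[1].strip()
        obs, done = env.step(action)
        history.append(f"action: {action}\nobservation: {obs}")
        if done:
            break
    return history
```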
CLAUSE: Agentic Neuro-Symbolic Knowledge Graph Reasoning via Dynamic Learnable Context Engineering
Zhao, Yang, Dai, Chengxiao, Zhuo, Wei, Xiu, Yue, Niyato, Dusit
Knowledge graphs provide structured context for multi-hop question answering, but deployed systems must balance answer accuracy with strict latency and cost targets while preserving provenance. Static k-hop expansions and "think-longer" prompting often over-retrieve, inflate context, and yield unpredictable runtime. We introduce CLAUSE, an agentic three-agent neuro-symbolic framework that treats context construction as a sequential decision process over knowledge graphs, deciding what to expand, which paths to follow or backtrack, what evidence to keep, and when to stop. Latency (interaction steps) and prompt cost (selected tokens) are exposed as user-specified budgets or prices, allowing per-query adaptation to trade-offs among accuracy, latency, and cost without retraining. CLAUSE employs the proposed Lagrangian-Constrained Multi-Agent Proximal Policy Optimization (LC-MAPPO) algorithm to coordinate three agents: Subgraph Architect, Path Navigator, and Context Curator, so that subgraph construction, reasoning-path discovery, and evidence selection are jointly optimized under per-query resource budgets on edge edits, interaction steps, and selected tokens. Across HotpotQA, MetaQA, and FactKG, CLAUSE yields higher EM@1 while reducing subgraph growth and end-to-end latency at equal or lower token budgets. On MetaQA-2-hop, relative to the strongest RAG baseline (GraphRAG), CLAUSE achieves +39.3 EM@1 with 18.6% lower latency and 40.9% lower edge growth. The resulting contexts are compact, provenance-preserving, and deliver predictable performance under deployment constraints.
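The Lagrangian machinery that LC-MAPPO's name points to is standard in constrained policy optimization: each budget gets a multiplier (a price) that rises while its constraint is violated and is projected back to zero otherwise, so the scalarized objective becomes reward minus the priced budget overruns. Below is a minimal sketch of that dual update under this generic reading, with illustrative names; it is not the paper's exact LC-MAPPO update.

```python
import numpy as np

def dual_update(lambdas, avg_costs, budgets, lr=0.01):
    """Projected gradient-ascent step on the Lagrange multipliers.

    The scalarized objective is reward - sum_i lambda_i * (cost_i - budget_i);
    each multiplier rises while its budget (edge edits, interaction steps,
    selected tokens) is violated and decays toward zero otherwise. Generic
    constrained-policy-optimization machinery, not the paper's exact update.
    """
    lambdas = lambdas + lr * (avg_costs - budgets)
    return np.maximum(lambdas, 0.0)  # multipliers stay non-negative

# Three per-query budgets: edge edits, interaction steps, selected tokens.
lam = np.zeros(3)
lam = dual_update(lam,
                  avg_costs=np.array([12.0, 6.0, 900.0]),
                  budgets=np.array([10.0, 8.0, 1024.0]))
# Only the violated edge-edit constraint receives a positive price.
```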